Results 1 - 20 of 74
1.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38605642

ABSTRACT

MicroRNAs (miRNAs) synergize with various biomolecules in human cells, resulting in diverse functions in regulating a wide range of biological processes. Predicting potential disease-associated miRNAs as valuable biomarkers contributes to the treatment of human diseases. However, few previous methods take a holistic perspective; most concentrate only on isolated miRNA and disease objects, thereby ignoring that human cells are responsible for multiple relationships. In this work, we first constructed a multi-view graph based on the relationships between miRNAs and various biomolecules, and then utilized a graph attention neural network to learn the graph topology features of miRNAs and diseases for each view. Next, we added a further attention mechanism and developed a multi-scale feature fusion module, aiming to determine the optimal fusion results for the multi-view topology features of miRNAs and diseases. In addition, the prior attribute knowledge of miRNAs and diseases was simultaneously added to achieve better prediction results and solve the cold start problem. Finally, the learned miRNA and disease representations were concatenated and fed into a multi-layer perceptron for end-to-end training and prediction of potential miRNA-disease associations. To assess the efficacy of our model (called MUSCLE), we performed 5- and 10-fold cross-validation (CV), which yielded average areas under the ROC curve of 0.966 ± 0.0102 and 0.973 ± 0.0135, respectively, outperforming most current state-of-the-art models. We then examined the impact of crucial parameters on prediction performance and performed ablation experiments on the feature combination and model architecture. Furthermore, the case studies on colon cancer, lung cancer and breast cancer also demonstrate the good inductive capability of MUSCLE. Our data and code are freely available at a public GitHub repository: https://github.com/zht-code/MUSCLE.git.
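
The fusion step described in this abstract (attention over per-view features, then concatenation of the miRNA and disease representations) can be illustrated with a minimal sketch. It is not the MUSCLE implementation: the function names `fuse_views` and `pair_features` are hypothetical, and the attention scores are passed in as fixed numbers here, whereas in a trained model they would be learned.

```python
import math

def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_views(view_embeddings, view_scores):
    """Attention-weighted sum of one node's embeddings across views.

    view_embeddings: list of equal-length vectors, one per view.
    view_scores: one raw attention score per view (assumed given here).
    """
    weights = softmax(view_scores)
    dim = len(view_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, view_embeddings))
        for d in range(dim)
    ]

def pair_features(mirna_vec, disease_vec):
    """Concatenate the two representations as input to a downstream classifier."""
    return list(mirna_vec) + list(disease_vec)
```

With equal scores the views are simply averaged; a larger score for one view shifts the fused vector toward that view.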


Subject(s)
Colonic Neoplasms; Lung Neoplasms; MicroRNAs; Humans; Muscles; Learning; MicroRNAs/genetics; Algorithms; Computational Biology
2.
Comput Biol Med ; 172: 108301, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38492453

ABSTRACT

Accurately predicting the survival rate of cancer patients is crucial for aiding clinicians in planning appropriate treatment, reducing cancer-related medical expenses, and significantly enhancing patients' quality of life. Multimodal prediction of cancer patient survival offers a more comprehensive and precise approach. However, existing methods still grapple with challenges related to missing multimodal data and information interaction within modalities. This paper introduces SELECTOR, a heterogeneous graph-aware network based on convolutional mask encoders for robust multimodal prediction of cancer patient survival. SELECTOR comprises feature edge reconstruction, convolutional mask encoder, feature cross-fusion, and multimodal survival prediction modules. Initially, we construct a multimodal heterogeneous graph and employ the meta-path method for feature edge reconstruction, ensuring comprehensive incorporation of feature information from graph edges and effective embedding of nodes. To mitigate the impact of missing features within the modality on prediction accuracy, we devised a convolutional masked autoencoder (CMAE) to process the heterogeneous graph post-feature reconstruction. Subsequently, the feature cross-fusion module facilitates communication between modalities, ensuring that output features encompass all features of the modality and relevant information from other modalities. Extensive experiments and analysis on six cancer datasets from TCGA demonstrate that our method significantly outperforms state-of-the-art methods in both modality-missing and intra-modality information-confirmed cases. Our codes are made available at https://github.com/panliangrui/Selector.


Subject(s)
Neoplasms; Quality of Life; Humans; Neoplasms/diagnostic imaging
3.
Mol Ther Nucleic Acids ; 35(1): 102139, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38384447

ABSTRACT

MicroRNAs (miRNAs) play a crucial role in the prevention, prognosis, diagnosis, and treatment of complex diseases. Existing computational methods primarily focus on biologically relevant molecules directly associated with miRNA or disease, overlooking the fact that the human body is a highly complex system where miRNA or disease may indirectly correlate with various types of biomolecules. To address this, we propose a novel prediction model named MHGTMDA (miRNA and disease association prediction using heterogeneous graph transformer based on molecular heterogeneous graph). MHGTMDA integrates biological entity relationships of eight biomolecules, constructing a relatively comprehensive heterogeneous biological entity graph. MHGTMDA serves as a powerful molecular heterogeneity map transformer, capturing structural elements and properties of miRNAs and diseases, revealing potential associations. In a 5-fold cross-validation study, MHGTMDA achieved an area under the receiver operating characteristic curve of 0.9569, surpassing state-of-the-art methods by at least 3%. Feature ablation experiments suggest that considering features among multiple biomolecules is more effective in uncovering miRNA-disease correlations. Furthermore, we conducted differential expression analyses on breast cancer and lung cancer, using MHGTMDA to further validate differentially expressed miRNAs. The results demonstrate MHGTMDA's capability to identify novel MDAs.

4.
Am J Cancer Res ; 13(8): 3342-3367, 2023.
Article in English | MEDLINE | ID: mdl-37693148

ABSTRACT

Emerging research indicates that circRNAs play a crucial role in the occurrence and development of cancers. This study aimed to uncover the biological role of hsa_circ_0000519 in the progression of lung adenocarcinoma (LUAD). hsa_circ_0000519 was identified by bioinformatic analysis, and its differential expression was validated in LUAD tissues and cell lines. CCK8, colony formation, wound healing, and transwell assays, together with xenograft tumor models, were used to observe the biological functions of hsa_circ_0000519. FISH, RIP, dual luciferase reporter assays, and recovery experiments were implemented to explore the underlying mechanisms of hsa_circ_0000519. hsa_circ_0000519 was significantly upregulated in LUAD tissues and cell lines. The expression of hsa_circ_0000519 was positively correlated with T grade and TNM stage in patients with LUAD. Downregulation of hsa_circ_0000519 remarkably reduced cell proliferation, migration, and invasion in vitro and tumor growth in vivo. Mechanistic investigation demonstrated that hsa_circ_0000519 directly sponged hsa-miR-1296-5p to reduce its repressive impact on DARS and to activate the PI3K/AKT/mTOR signaling pathway. The malignant phenotypes of LUAD cells induced by upregulation of hsa_circ_0000519 could be rescued by hsa-miR-1296-5p overexpression or knockdown of DARS. In conclusion, hsa_circ_0000519 promotes LUAD progression through the hsa-miR-1296-5p/DARS axis and may serve as a novel biomarker and therapeutic target for LUAD.

5.
Comput Biol Med ; 165: 107418, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37716243

ABSTRACT

Early detection of Sepsis is crucial for improving patient outcomes, as it is a significant public health concern that results in substantial morbidity and mortality. However, despite the widespread use of the Sequential Organ Failure Assessment (SOFA) in clinical settings to identify sepsis, obtaining sufficient physiological data before onset remains challenging, limiting early detection of sepsis. To address this challenge, we propose an interpretable machine learning model, ITFG (Interpretable Tree-based Feature Generation), that leverages potential correlations between features based on existing knowledge to identify sepsis within six hours of onset using valuable and continuous physiological measures. Furthermore, we introduce a Semi-supervised Attention-based Conditional Transfer Learning (SAC-TL) framework to enhance the model's generality and enable it to be used for early warning of sepsis in the target domain with less information from the source domain. Our proposed approaches effectively address the problem of systematic feature sparsity and missing data, while also being practical for different degrees of generalizability. We evaluated our proposed approaches on open datasets, MIMIC and PhysioNet, obtaining AUC of 97.98% and 86.21%, respectively, demonstrating their effectiveness in different data environments and achieving the best early detection results.


Subject(s)
Sepsis; Humans; Sepsis/diagnosis; Supervised Machine Learning; Machine Learning; Early Diagnosis; Public Health
6.
J Comput Biol ; 30(9): 1034-1045, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37707993

ABSTRACT

Drug-drug interaction (DDI) is a key concern in drug development and pharmacovigilance. It is important to improve DDI predictions by integrating multisource data from various pharmaceutical companies. Unfortunately, data privacy and financial interest issues seriously affect interinstitutional collaborations for DDI predictions. We propose multiparty computation DDI (MPCDDI), a secure MPC-based deep learning framework for DDI predictions. MPCDDI leverages secret sharing technologies to incorporate drug-related feature data from multiple institutions and develops a deep learning model for DDI predictions. In MPCDDI, all data transmission and deep learning operations are integrated into secure MPC frameworks to enable high-quality collaboration among pharmaceutical institutions without divulging private drug-related information. The results suggest that MPCDDI is superior to eight other baselines and achieves performance similar to that of the corresponding plaintext collaborations. More interestingly, MPCDDI significantly outperforms methods that use private data from a single institution. In summary, MPCDDI is an effective framework for promoting collaborative and privacy-preserving drug discovery.
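
The abstract does not say which secret sharing scheme MPCDDI uses. As an illustration of the general idea only, the sketch below shows additive secret sharing over a prime field, one common building block of MPC: each party holds a random-looking share, and sums can be computed on shares without revealing the inputs. The modulus and the function names are assumptions chosen for this example.

```python
import secrets

# Assumed modulus for the example: the Mersenne prime 2**61 - 1.
PRIME = 2**61 - 1

def share(value, n_parties):
    """Split an integer into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]

def reconstruct(shares):
    """Recover the secret by summing all shares modulo PRIME."""
    return sum(shares) % PRIME

def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally; no secret is revealed."""
    return [(a + b) % PRIME for a, b in zip(a_shares, b_shares)]
```

For example, two institutions could share the values 10 and 32 among three parties; the parties add their shares locally, and only the reconstructed sum, 42, is ever opened.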


Subject(s)
Drug Development; Drug Discovery; Drug Interactions; Neural Networks, Computer; Pharmaceutical Preparations
7.
J Comput Biol ; 30(9): 961-971, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37594774

ABSTRACT

Drug-drug interactions (DDIs) can have a significant impact on patient safety and health. Predicting potential DDIs before administering drugs to patients is a critical step in drug development and can help prevent adverse drug events. In this study, we propose a novel method called HF-DDI for predicting DDI events based on various drug features, including molecular structure, target, and enzyme information. Specifically, we design our model with both early fusion and late fusion strategies and utilize a score calculation module to predict the likelihood of interactions between drugs. Our model was trained and tested on a large data set of known DDIs, achieving an overall accuracy of 0.948. The results suggest that incorporating multiple drug features can improve the accuracy of DDI event prediction and may be useful for improving drug safety and patient outcomes.


Subject(s)
Drug-Related Side Effects and Adverse Reactions; Humans; Drug Interactions
8.
iScience ; 26(7): 107013, 2023 Jul 21.
Article in English | MEDLINE | ID: mdl-37389184

ABSTRACT

Exploring early detection methods through comprehensive evaluation of DNA methylation for lung squamous cell carcinoma (LUSC) patients is of great significance. By using different machine learning algorithms for feature selection and model construction based on The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases, five methylation biomarkers in LUSC (along with mapped genes) were identified including cg14823851 (TBX4), cg02772121 (TRIM15), cg10424681 (C6orf201), cg12910906 (ARHGEF4), and cg20181079 (OR4D11), achieving extremely high sensitivity and specificity in distinguishing LUSC from normal samples in independent cohorts. Pyrosequencing assay verified DNA methylation levels, meanwhile qRT-PCR and immunohistochemistry results presented their accordant methylation-related gene expression statuses in paired LUSC and normal lung tissues. The five methylation-based biomarkers proposed in this study have great potential for the diagnosis of LUSC and could guide studies in methylation-regulated tumor development and progression.

9.
Aging (Albany NY) ; 15(5): 1394-1411, 2023 Feb 24.
Article in English | MEDLINE | ID: mdl-36863716

ABSTRACT

Lipid metabolism plays an essential role in the genesis and progression of acute myocardial infarction (AMI). Herein, we identified and verified latent lipid-related genes involved in AMI by bioinformatic analysis. Lipid-related differentially expressed genes (DEGs) involved in AMI were identified using the GSE66360 dataset from the Gene Expression Omnibus (GEO) database and R software packages. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted to analyze lipid-related DEGs. Lipid-related genes were identified by two machine learning techniques: least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE). Receiver operating characteristic (ROC) curves were used to describe diagnostic accuracy. Furthermore, blood samples were collected from AMI patients and healthy individuals, and real-time quantitative polymerase chain reaction (RT-qPCR) was used to determine the RNA levels of four lipid-related DEGs. Fifty lipid-related DEGs were identified, 28 upregulated and 22 downregulated. Several enrichment terms related to lipid metabolism were found by GO and KEGG enrichment analyses. After LASSO and SVM-RFE screening, four genes (ACSL1, CH25H, GPCPD1, and PLA2G12A) were identified as potential diagnostic biomarkers for AMI. Moreover, the RT-qPCR analysis indicated that the expression levels of the four DEGs in AMI patients and healthy individuals were consistent with the bioinformatics analysis results. The validation in clinical samples suggested that the four lipid-related DEGs are expected to be diagnostic markers for AMI and to provide new targets for lipid therapy of AMI.
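
ROC curves, used above to describe diagnostic accuracy, are usually summarized by the area under the curve (AUC). A minimal sketch of that computation follows, using the standard pairwise-ranking definition: the AUC is the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as one half. The function name and the toy numbers are illustrative and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    labels: 1 for cases (e.g. AMI), 0 for controls.
    scores: higher means more likely to be a case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))
```

The quadratic pairwise loop is adequate for small cohorts; a rank-based formulation would be preferable for large ones.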


Subject(s)
Computational Biology; Myocardial Infarction; Humans; Biomarkers; Coenzyme A Ligases/genetics; Databases, Factual; Lipids; Myocardial Infarction/diagnosis; Myocardial Infarction/genetics; Phospholipases; Group I Phospholipases A2/metabolism
10.
NAR Genom Bioinform ; 5(1): lqad012, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36789031

ABSTRACT

Infectious diseases are emerging at an unprecedented pace, posing serious challenges to public health and the global economy. Virulence factors (VFs) enable pathogens to adhere, reproduce and cause damage to host cells, and antibiotic resistance genes (ARGs) allow pathogens to evade otherwise curable treatments. Simultaneous identification of VFs and ARGs can save pathogen surveillance time, especially in in situ epidemic pathogen detection. However, most tools can only predict either VFs or ARGs. The few tools that predict VFs and ARGs simultaneously usually have high false-negative rates, are sensitive to cutoff thresholds and can only identify conserved genes. For better simultaneous prediction of VFs and ARGs, we propose a hybrid deep ensemble learning approach called HyperVR. By considering both best hit scores and statistical gene sequence patterns, HyperVR combines classical machine learning and deep learning to simultaneously and accurately predict VFs, ARGs and negative genes (neither VFs nor ARGs). For the prediction of individual VFs and ARGs, in silico spike-in experiments (the VFs and ARGs in real metagenomic data), and pseudo-VFs and -ARGs (gene fragments), HyperVR outperforms the current state-of-the-art prediction tools. HyperVR uses only gene sequence information without strict cutoff thresholds, hence making prediction straightforward and reliable.

11.
Interdiscip Sci ; 15(2): 262-272, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36656448

ABSTRACT

Differentiation of ductal carcinoma in situ (DCIS, a precancerous lesion of the breast) from fibroadenoma (FA) using ultrasonography is significant for the early prevention of malignant breast tumors. Radiomics-based artificial intelligence (AI) can provide additional diagnostic information but usually requires extensive labeling efforts by clinicians with specialized knowledge. This study aims to investigate the feasibility of differentially diagnosing DCIS and FA using ultrasound radiomics-based AI techniques and further explore a novel approach that can reduce labeling efforts without sacrificing diagnostic performance. We included 461 DCIS and 651 FA patients, of whom 139 DCIS and 181 FA patients constituted a prospective test cohort. First, various feature engineering-based machine learning (FEML) and deep learning (DL) approaches were developed. Then, we designed a difference-based self-supervised (DSS) learning approach that only required FA samples to participate in training. The DSS approach consists of three steps: (1) pretraining a Bootstrap Your Own Latent (BYOL) model using FA images, (2) reconstructing images using the encoder and decoder of the pretrained model, and (3) distinguishing DCIS from FA based on the differences between the original and reconstructed images. The experimental results showed that the trained FEML and DL models achieved the highest AUC of 0.7935 (95% confidence interval, 0.7900-0.7969) on the prospective test cohort, indicating that the developed models are effective for assisting in differentiating DCIS from FA based on ultrasound images. Furthermore, the DSS model achieved an AUC of 0.8172 (95% confidence interval, 0.8124-0.8219), indicating that our model outperforms the conventional radiomics-based AI models and is more competitive.
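
Step (3) of the DSS approach, distinguishing DCIS from FA by how much an image differs from its reconstruction, can be sketched as a simple anomaly score. The abstract does not specify how the differences are measured or thresholded, so the mean absolute pixel difference, the threshold, and the function names below are all assumptions made for illustration. The underlying intuition is that a model pretrained only on FA images should reconstruct FA images well and DCIS images poorly.

```python
def difference_score(original, reconstructed):
    """Mean absolute pixel difference between two equally sized grayscale images.

    Images are given as lists of rows of numeric pixel values.
    """
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of rows")
    total = 0.0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        if len(row_o) != len(row_r):
            raise ValueError("rows must have the same length")
        for a, b in zip(row_o, row_r):
            total += abs(a - b)
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return total / count

def classify(original, reconstructed, threshold):
    """Label an image DCIS when its reconstruction error exceeds the threshold."""
    score = difference_score(original, reconstructed)
    return "DCIS" if score > threshold else "FA"
```

In practice the threshold would be tuned on a validation set rather than fixed by hand.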


Subject(s)
Breast Neoplasms; Carcinoma, Intraductal, Noninfiltrating; Fibroadenoma; Humans; Female; Carcinoma, Intraductal, Noninfiltrating/diagnostic imaging; Carcinoma, Intraductal, Noninfiltrating/pathology; Artificial Intelligence; Diagnosis, Differential; Fibroadenoma/diagnostic imaging; Fibroadenoma/pathology; Prospective Studies; Breast Neoplasms/diagnostic imaging; Ultrasonography
12.
Brief Bioinform ; 24(1), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36460622

ABSTRACT

Drug response prediction in cancer cell lines is of great significance in personalized medicine. In this study, we propose GADRP, a cancer drug response prediction model based on graph convolutional networks (GCNs) and autoencoders (AEs). We first use a stacked deep AE to extract low-dimensional representations from cell line features, and then construct a sparse drug-cell line pair (DCP) network incorporating drug, cell line, and DCP similarity information. Next, an initial residual and layer attention-based GCN (ILGCN), which can alleviate the over-smoothing problem, is utilized to learn DCP features. Finally, a fully connected network is employed to make predictions. Benchmarking results demonstrate that GADRP can significantly improve prediction performance on all metrics compared with baselines on five datasets. In particular, experiments on the prediction of unknown DCP responses, drug-cancer tissue associations, and drug-pathway associations illustrate the predictive power of GADRP. All results highlight the effectiveness of GADRP in predicting drug responses and its potential value in guiding anti-cancer drug selection.


Subject(s)
Antineoplastic Agents; Neoplasms; Humans; Neoplasms/drug therapy; Antineoplastic Agents/pharmacology; Antineoplastic Agents/therapeutic use; Benchmarking; Cell Line; Learning
13.
BMC Med ; 20(1): 368, 2022 Oct 17.
Article in English | MEDLINE | ID: mdl-36244991

ABSTRACT

BACKGROUND: Considering the heterogeneity of tumors, it is a key issue in precision medicine to predict the drug response of each individual. The accumulation of various types of drug informatics and multi-omics data facilitates the development of efficient models for drug response prediction. However, the selection of high-quality data sources and the design of suitable methods remain a challenge. METHODS: In this paper, we design NeRD, a multidimensional data integration model based on the PRISM drug response database, to predict the cellular response of drugs. Four feature extractors, including drug structure extractor (DSE), molecular fingerprint extractor (MFE), miRNA expression extractor (mEE), and copy number extractor (CNE), are designed for different types and dimensions of data. A fully connected network is used to fuse all features and make predictions. RESULTS: Experimental results demonstrate the effective integration of the global and local structural features of drugs, as well as the features of cell lines from different omics data. For all metrics tested on the PRISM database, NeRD surpassed previous approaches. We also verified that NeRD has strong reliability in the prediction results of new samples. Moreover, unlike other algorithms, when the amount of training data was reduced, NeRD maintained stable performance. CONCLUSIONS: NeRD's feature fusion provides a new idea for drug response prediction, which is of great significance for precise cancer treatment.


Subject(s)
MicroRNAs; Neoplasms; Algorithms; Humans; Neoplasms/drug therapy; Neural Networks, Computer; Reproducibility of Results
14.
NAR Genom Bioinform ; 4(3): lqac057, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35937545

ABSTRACT

Temperate phages (active prophages induced from bacteria) help control pathogenicity, modulate community structure, and maintain gut homeostasis. Complete phage genome sequences are indispensable for understanding phage biology. Traditional plaque techniques are inapplicable to temperate phages due to their lysogenicity, curbing their identification and characterization. Existing bioinformatics tools for prophage prediction usually fail to detect accurate and complete temperate phage genomes. This study proposes a novel computational temperate phage detection method (TemPhD) mining both the integrated active prophages and their spontaneously induced forms (temperate phages) from next-generation sequencing raw data. Applying the method to the available dataset resulted in 192 326 complete temperate phage genomes with different host species, expanding the existing number of complete temperate phage genomes by more than 100-fold. The wet-lab experiments demonstrated that TemPhD can accurately determine the complete genome sequences of the temperate phages, with exact flanking sites, outperforming other state-of-the-art prophage prediction methods. Our analysis indicates that temperate phages are likely to function in the microbial evolution by (i) cross-infecting different bacterial host species; (ii) transferring antibiotic resistance and virulence genes and (iii) interacting with hosts through restriction-modification and CRISPR/anti-CRISPR systems. This work provides a comprehensively complete temperate phage genome database and relevant information, which can serve as a valuable resource for phage research.

15.
J Comput Biol ; 29(10): 1095-1103, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35984993

ABSTRACT

The detection and classification of nuclei play an important role in histopathological analysis. The aim is to determine the distribution of nuclei in histopathology images for subsequent analysis and research. However, it is very challenging to detect and localize nuclei in histopathology images because each nucleus occupies only a few pixels, making it difficult to detect. Most automatic detection machine learning algorithms use patches, which are small pieces of images containing a single cell, as training data, and then apply a sliding window strategy to detect nuclei in histopathology images. These methods require preprocessing of the data set, which is very tedious, and it is also difficult to localize the detected results on the original images. Fully convolutional network-based deep learning methods can take images as raw inputs and output results of corresponding size, which makes them well suited to the nuclei detection and classification task. In this study, we propose a novel multi-scale fully convolutional network, named Cell Fully Convolutional Network (CFCN), with dilated convolution for fine-grained nuclei classification and localization in histology images. We trained CFCN on a typical histology image data set, and the experimental results show that CFCN outperforms other state-of-the-art nuclei classification models, with an F1 score of 0.750.
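
Dilated convolution, which CFCN relies on, spaces the kernel taps apart so that the receptive field grows without adding weights. The sketch below shows the idea in one dimension with no padding. It is a simplified illustration and not the CFCN layer, which operates on 2D images, and the function name is hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1D cross-correlation with a dilated kernel.

    With dilation d, consecutive kernel taps read inputs d positions apart,
    so a kernel of size k covers (k - 1) * d + 1 input positions.
    """
    if dilation < 1:
        raise ValueError("dilation must be at least 1")
    span = (len(kernel) - 1) * dilation
    out = []
    for start in range(len(signal) - span):
        out.append(
            sum(k * signal[start + i * dilation] for i, k in enumerate(kernel))
        )
    return out
```

With a three-tap kernel, dilation 1 covers three neighbouring inputs while dilation 2 covers five, which is how stacked dilated layers capture context at several scales.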


Subject(s)
Algorithms; Neural Networks, Computer; Cell Nucleus/pathology; Image Processing, Computer-Assisted/methods; Machine Learning
16.
Comput Biol Chem ; 99: 107735, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35850048

ABSTRACT

The development of third-generation sequencing technology has brought significant changes to genomics. Compared to second-generation sequencing methods, third-generation technologies produce reads around 100 times longer, which reveal new genomic variations that fill long-standing gaps in the human reference genome. However, the excessive length and high error rate of these reads severely increase the amount of data and the alignment cost. Traditional data analysis platforms and serial sequence alignment methods cannot effectively deal with large-scale long-read alignment. There is a critical need for a novel data analysis platform that can deliver fast alignment of large-scale sequences to solve the problem of long-read alignment. High-performance computing platforms and efficient, scalable algorithms based on these platforms have significant potential to impact sequence analysis approaches. This paper presents minimapR, a multi-level parallel long-read alignment tool based on minimap2, a popular third-generation read aligner. MinimapR is built on Ray, a new high-performance distributed framework. Ray fully integrates with the Python environment and can be easily installed with pip. MinimapR can utilize the power of multiple computing nodes, significantly accelerating alignment without sacrificing sensitivity. The minimapR tool was tested on 64 nodes and demonstrated a 50-fold increase in speed with 78% parallel efficiency. The source code and user manual of minimapR are freely available at https://github.com/Geehome/minimapR.
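
The reported figures for 64 nodes are consistent with the usual definition of parallel efficiency, speedup divided by the number of workers: 50 / 64 ≈ 0.78. A short sketch of that arithmetic follows; the function names are illustrative, and the assumption that the authors used exactly this definition is not confirmed by the abstract.

```python
def speedup(serial_time, parallel_time):
    """How many times faster the parallel run is than the serial run."""
    return serial_time / parallel_time

def parallel_efficiency(speedup_factor, n_workers):
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    return speedup_factor / n_workers

# Figures quoted in the abstract: 50-fold speedup on 64 nodes.
efficiency = parallel_efficiency(50, 64)
```

Here `efficiency` evaluates to 0.78125, which rounds to the 78% stated above.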


Subject(s)
High-Throughput Nucleotide Sequencing; Software; Algorithms; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Humans; Sequence Alignment; Sequence Analysis, DNA/methods
17.
Biology (Basel) ; 11(5), 2022 May 20.
Article in English | MEDLINE | ID: mdl-35625505

ABSTRACT

Increasing evidence has suggested that microRNAs (miRNAs) are significant in research on human diseases. Predicting possible associations between miRNAs and diseases would provide new perspectives on disease diagnosis, pathogenesis, and gene therapy. However, because traditional in vitro studies are inherently time-consuming and expensive, there is an urgent need for a computational approach that would allow researchers to identify potential associations between miRNAs and diseases for further research. In this paper, we present a novel computational method called SMMDA to predict potential miRNA-disease associations. In particular, SMMDA first utilizes a new disease representation method (MeSHHeading2vec) based on a network embedding algorithm and then fuses it with Gaussian interaction profile kernel similarity information of miRNAs and diseases, disease semantic similarity, and miRNA functional similarity. Second, SMMDA utilizes a deep autoencoder network to further transform the original features and achieve a better feature representation. Finally, the ensemble learning model XGBoost is used as the underlying training and prediction method for SMMDA. SMMDA achieved a mean accuracy of 86.68% with a standard deviation of 0.42% and a mean AUC of 94.07% with a standard deviation of 0.23%, outperforming many previous works. Moreover, we also compared the predictive ability of SMMDA with different classifiers and different feature descriptors. In the case studies of three common human diseases, 47 (esophageal neoplasms), 48 (breast neoplasms), and 48 (colon neoplasms) of the top 50 candidate miRNAs were successfully verified by two other databases. The experimental results show that SMMDA has a reliable ability to predict potential miRNA-disease associations. Therefore, it is anticipated that SMMDA could be an effective tool for biomedical researchers.
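
Gaussian interaction profile (GIP) kernel similarity, one of the inputs fused by SMMDA, is commonly defined as K(i, j) = exp(-γ ‖IP(i) − IP(j)‖²), where IP(i) is row i of the binary association matrix and γ is a bandwidth normalized by the mean squared norm of the profiles. The sketch below follows that common definition. Whether SMMDA uses exactly this normalization is an assumption, and the function name and the default `gamma_prime=1.0` are illustrative.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel similarity matrix.

    profiles: one binary interaction profile per entity, e.g. one row of
    the miRNA-disease association matrix per miRNA.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("at least one profile must contain a known association")
    gamma = gamma_prime / mean_sq_norm  # normalized bandwidth
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * dist_sq)
    return sim
```

Entities with identical profiles get similarity 1, and the similarity decays as their known associations diverge.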

18.
Brief Bioinform ; 23(3), 2022 May 13.
Article in English | MEDLINE | ID: mdl-35443040

ABSTRACT

Target prediction and virtual screening are two powerful tools of computer-aided drug design. Target identification is of great significance for hit discovery, lead optimization, drug repurposing and elucidation of the mechanism. Virtual screening can improve the hit rate of drug screening to shorten the cycle of drug discovery and development. Therefore, target prediction and virtual screening are of great importance for developing highly effective drugs against COVID-19. Here we present D3AI-CoV, a platform for target prediction and virtual screening for the discovery of anti-COVID-19 drugs. The platform is composed of three newly developed deep learning-based models i.e., MultiDTI, MPNNs-CNN and MPNNs-CNN-R models. To compare the predictive performance of D3AI-CoV with other methods, an external test set, named Test-78, was prepared, which consists of 39 newly published independent active compounds and 39 inactive compounds from DrugBank. For target prediction, the areas under the receiver operating characteristic curves (AUCs) of MultiDTI and MPNNs-CNN models are 0.93 and 0.91, respectively, whereas the AUCs of the other reported approaches range from 0.51 to 0.74. For virtual screening, the hit rate of D3AI-CoV is also better than other methods. D3AI-CoV is available for free as a web application at http://www.d3pharma.com/D3Targets-2019-nCoV/D3AI-CoV/index.php, which can serve as a rapid online tool for predicting potential targets for active compounds and for identifying active molecules against a specific target protein for COVID-19 treatment.


Subject(s)
COVID-19 Drug Treatment; Deep Learning; Antiviral Agents/pharmacology; Antiviral Agents/therapeutic use; Drug Repositioning; Humans; Molecular Docking Simulation; SARS-CoV-2
19.
Brief Bioinform ; 23(2), 2022 Mar 10.
Article in English | MEDLINE | ID: mdl-35180781

ABSTRACT

Although there are a large number of structural variations in the chromosomes of each individual, accurate methods for identifying clinically pathogenic variants are lacking. Here, we propose SVPath, a machine learning-based method to predict the pathogenicity of deletion, insertion and duplication structural variations that occur in exons. We constructed three types of annotation features for each structural variation event in the ClinVar database. First, we treated complex structural variations as multiple consecutive single nucleotide polymorphism events, and annotated them with correlation scores based on single nucleic acid substitutions, such as the impact on protein function. Second, we determined which genes the variation occurred in, and constructed gene-based annotation features for each structural variation. Third, we also calculated related features based on the transcriptome, such as histone signal, the overlap ratio of variation and genomic element definitions, etc. Finally, we employed a gradient boosting decision tree machine learning method, and used the deletions, insertions and duplications in the ClinVar database to train a structural variation pathogenicity prediction model, SVPath. These structural variations are clearly labeled as pathogenic or benign. Experimental results show that SVPath achieves excellent predictive performance and outperforms existing state-of-the-art tools. SVPath is very promising for evaluating the clinical pathogenicity of structural variants. SVPath can be used in clinical research to predict the clinical significance of new structural variations and of those with unknown pathogenicity, so as to explore the relationship between diseases and structural variations in a computational way.


Subject(s)
Machine Learning; Polymorphism, Single Nucleotide; Exons; Humans; Molecular Sequence Annotation; Virulence
20.
Interdiscip Sci ; 14(1): 15-21, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35066811

ABSTRACT

The coronavirus disease (COVID-19) has led to a rush to repurpose existing drugs, although the underlying evidence base is of variable quality. Drug repurposing is a technique that takes advantage of existing known drugs or drug combinations by exploring them in an unexpected medical scenario. Drug repurposing, hence, plays a vital role in accelerating the pre-clinical process of designing novel drugs by saving time and cost compared to traditional de novo drug discovery processes. Since drug repurposing depends on massive observed data from existing drugs and diseases, the tremendous growth of publicly available large-scale machine learning methods supplies state-of-the-art applications of data science to signaling disease, medicine, therapeutics, and identifying targets with the least error. In this article, we introduce guidelines on strategies and options for utilizing machine learning approaches to accelerate drug repurposing. We discuss how to employ machine learning methods in studying precision medicine and, as an example, how machine learning approaches can accelerate COVID-19 drug repurposing by developing traditional Chinese medicine therapy. This article provides a strong rationale for employing machine learning methods for drug repurposing, including in the fight against the COVID-19 pandemic.


Subject(s)
COVID-19 Drug Treatment; Drug Repositioning; Drug Repositioning/methods; Humans; Machine Learning; Pandemics; SARS-CoV-2